
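- Two characteristics: exploration and exploitation
- Why use multi-armed bandit algorithms
- The epsilon-greedy algorithm
- Debugging bandit algorithms
- The softmax algorithm
- UCB: the upper confidence bound algorithm
- Bandits in the real world: complexity and complications
- Conclusion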

from IPython.display import Image
Image(filename='png/1.png')
Image(filename='png/2.png')
Image(filename='png/3.png')
Image(filename='png/4.png')
Image(filename='png/5.png')
Image(filename='png/6.png')
Image(filename='png/7.png')
Image(filename='png/8.png')
Image(filename='png/9.png')
Image(filename='png/10.png')
Image(filename='png/11.png')
Image(filename='png/12.png')

Image(filename='png/13.png')
Image(filename='png/14.png')
Image(filename='png/15.png')
Image(filename='png/16.png')
Image(filename='png/17.png')
Image(filename='png/18.png')
Image(filename='png/19.png')
Image(filename='png/power.png')
Image(filename='png/power2.png')
Image(filename='png/20.png')
Image(filename='png/21.png')
Image(filename='png/22.png')
Image(filename='png/23.png')
Image(filename='png/24.png')
Image(filename='png/25.png')
Image(filename='png/26.png')
Image(filename='png/27.png')
Image(filename='png/28.png')
First, what risk does the participant undertake in the study? The main threshold is whether the risk exceeds “minimal risk”. Minimal risk is defined as the probability and magnitude of harm that a participant would encounter in normal daily life. The harm considered encompasses physical, psychological and emotional, social, and economic concerns. If the risk exceeds minimal risk, then informed consent is required. We’ll discuss informed consent further below.
In most, but not all, online experiments, it is debatable whether the experiment exposes participants to anything beyond minimal risk. What risk is a participant exposed to if we change the ranking of courses on an educational site, or if we change the UI of an online game?
Clear exceptions are websites or applications related to health or finance. In the Facebook experiment, for example, it is debatable whether participants were really exposed to anything beyond minimal risk: all items shown were going to be in their feed anyway, so the only question is whether removing some of the posts increased the risk.
Image(filename='png/29.png')
Image(filename='png/32.png')
Next, what benefits might result from the study? Even if the risk is minimal, how might the results help? In most online A/B testing, the benefit is an improved product. In other social sciences, the benefit is a better understanding of the human condition in ways that might help, for example, in education and development. In medicine, the risks are often higher, but the benefits are typically improved health outcomes.
It is important to be able to state what the benefit would be from completing the study.
Third, what other choices do participants have? For example, if you are testing changes to a search engine, participants always have the choice to use another search engine. The main issue is that the fewer alternatives participants have, the greater the concern about coercion: do participants really have a choice about whether to participate, and how does that balance against the risks and benefits?
For example, in medical clinical trials of new cancer drugs, the main alternative most participants face is death, so the allowable risk, given informed consent, is quite high.
In online experiments, the issues to consider are what alternative services a user might have and what the switching costs might be in terms of time, money, information, etc.
Finally, what data is being collected, and what is the expectation of privacy and confidentiality? This last question is quite nuanced, encompassing numerous questions:
Do participants understand what data is being collected about them? What harm would befall them should that data be made public?
Would they expect that data to be considered private and confidential? For example, if participants are being observed in a public setting (e.g., a football stadium), there is really no expectation of privacy. If the study is on existing public data, then there is also no expectation of further confidentiality. If, however, new data is being gathered, then the questions come down to:
What data is being gathered? How sensitive is it? Does it include financial and health data?
For example, data collected from observed “public” behavior, surveys, and interviews is often considered exempt from IRB review if it is not personally identifiable (reference: NSF FAQ below).
To summarize, there are really three main issues with data collection with regard to experiments:
For new data being collected and stored, how sensitive is the data, and what are the internal safeguards for handling it? E.g., what access controls are there, and how are security breaches caught and managed?
Image(filename='png/30.png')
Image(filename='png/31.png')